We are going to generate some music with more than one synthesizer.
We will then filter the lead tone out of the mix using a feed-forward neural network.
Model: input wave with 3 instruments -> output wave with 1 instrument
We will use an autoencoder-like setup, replacing the image with a short fragment of 1024 samples (~1/40th of a second) of sound data.
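To make the fragment idea concrete, here is a minimal sketch of cutting a waveform into rows of 1024 samples so that each row can serve as one training example. The helper `to_frames`, the stand-in sine signals, and the sampling rate of 44100 Hz are assumptions for illustration; the notebook's own data preparation may differ (for example, it may use overlapping fragments).

```python
import numpy as np

FRAME = 1024  # fragment length used in the text

def to_frames(wave, frame=FRAME):
    """Cut a 1-D waveform into non-overlapping rows of `frame` samples."""
    n = len(wave) // frame  # the incomplete tail is dropped
    return np.asarray(wave[:n * frame]).reshape(n, frame)

# Stand-in signals (assumption): a "lead" sine and a mix of lead plus a bass sine.
sr = 44100  # assumed sampling rate
t = np.arange(2 * sr) / sr
lead = 0.5 * np.sin(2 * np.pi * 440 * t)
mix_demo = lead + 0.3 * np.sin(2 * np.pi * 110 * t)

x_demo = to_frames(mix_demo)  # network input: fragments of the mix
y_demo = to_frames(lead)      # network target: matching fragments of the lead
print(x_demo.shape, y_demo.shape)
```

Each row of `x_demo` lines up with the same row of `y_demo`, which is what lets a plain dense network learn a fragment-to-fragment mapping.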

Audio(input_track[0:8*sr], rate=sr)
Audio(target_track[0:8*sr], rate=sr)

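import keras
from keras.layers import Dense, PReLU
from keras.optimizers import Adam

# Fully connected network mapping a fragment of the mix to a fragment of the lead.
model = keras.models.Sequential()
model.add(Dense(1024, input_shape=input_shape))
model.add(PReLU())
model.add(Dense(512))  # narrower hidden layer, as in an autoencoder bottleneck
model.add(PReLU())
model.add(Dense(output_shape))  # linear output: raw waveform samples
model.compile(Adam(), 'mse')  # mean squared error against the target fragment
model.summary()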
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 1024)              1049600
_________________________________________________________________
p_re_lu_1 (PReLU)            (None, 1024)              1024
_________________________________________________________________
dense_2 (Dense)              (None, 512)               524800
_________________________________________________________________
p_re_lu_2 (PReLU)            (None, 512)               512
_________________________________________________________________
dense_3 (Dense)              (None, 1024)              525312
=================================================================
Total params: 2,101,248
Trainable params: 2,101,248
Non-trainable params: 0
_________________________________________________________________
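The parameter counts in the summary can be checked by hand. The sketch below assumes an input fragment of 1024 samples (consistent with the count for dense_1) and uses two small helper functions that are not part of the notebook: a dense layer has one weight per input/output pair plus one bias per output, and a PReLU layer on a flat input learns one slope per unit.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of a dense layer."""
    return n_in * n_out + n_out

def prelu_params(n_units):
    """One learnable slope per unit."""
    return n_units

layers = [
    ("dense_1", dense_params(1024, 1024)),
    ("p_re_lu_1", prelu_params(1024)),
    ("dense_2", dense_params(1024, 512)),
    ("p_re_lu_2", prelu_params(512)),
    ("dense_3", dense_params(512, 1024)),
]
total = sum(n for _, n in layers)

for name, n in layers:
    print(f"{name:10s} {n:>9,d}")
print(f"{'total':10s} {total:>9,d}")
```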
Audio(model_predict(model, mix)[0:15*sr], rate=sr)
model.fit(x, y, epochs=2)
Epoch 1/2
40960/40960 [==============================] - 20s 482us/step - loss: 0.0047
Epoch 2/2
40960/40960 [==============================] - 20s 494us/step - loss: 0.0029
<keras.callbacks.History at 0x1272d4b70>
display(Audio(mix[40*sr:45*sr], rate=sr))
display(Audio(model_predict(model, mix)[40*sr:45*sr], rate=sr))
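The `model_predict` helper used above is defined elsewhere in the notebook. As a rough illustration of what such a helper could do, the sketch below cuts a waveform into 1024-sample fragments, runs them through anything with a `predict` method, and joins the results back into one waveform. The name `predict_wave` and the `IdentityModel` stand-in are hypothetical, and the real helper may handle the tail or overlap differently.

```python
import numpy as np

def predict_wave(model, wave, frame=1024):
    """Apply a fragment-wise model to a whole waveform (hypothetical helper)."""
    n = len(wave) // frame  # the incomplete tail is dropped
    frames = np.asarray(wave[:n * frame], dtype=np.float32).reshape(n, frame)
    out = model.predict(frames)       # expected shape: (n, frame)
    return np.asarray(out).reshape(-1)  # back to a 1-D waveform

class IdentityModel:
    """Stand-in with the same predict() interface as a Keras model."""
    def predict(self, batch):
        return batch

wave = np.linspace(-1.0, 1.0, 5000)
restored = predict_wave(IdentityModel(), wave)
print(restored.shape)
```

Because each fragment is predicted independently, small discontinuities can appear at fragment boundaries; overlapping fragments with cross-fading would be one way to reduce that.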
score_tracks_test, audio_tracks_test, mix_test = \
    generate_dataset(n_measures=64,
                     tempo=Tempo(120),
                     scale=GenericScale('E', [0, 1, 4, 5, 7, 8, 10]),
                     sampling_info=sampling_info)
Audio(model_predict(model, mix_test[0:15*44100]), rate=sr)